Skip to content

feat(codex): add codex app server adapter + e2e tests#2975

Open
joshsny wants to merge 16 commits into
feat/codex-app-server-adapterfrom
codex/app-server-adapter-2
Open

feat(codex): add codex app server adapter + e2e tests#2975
joshsny wants to merge 16 commits into
feat/codex-app-server-adapterfrom
codex/app-server-adapter-2

Conversation

@joshsny

@joshsny joshsny commented Jun 29, 2026

Copy link
Copy Markdown
Contributor

Adds app server implementation for codex adapter and e2e test coverage for the agent package


This is part 2 of 2 in a stack made with GitButler:

@github-actions

github-actions Bot commented Jun 29, 2026

Copy link
Copy Markdown

React Doctor found 1 issue in 1 file · 1 warning.

1 warning

src/features/sessions/components/SteerQueueToggle.tsx

Reviewed by React Doctor for commit 47cd18c.

@joshsny joshsny force-pushed the feat/codex-app-server-adapter branch from 00acd92 to 8899ff0 Compare June 29, 2026 09:47
@joshsny joshsny force-pushed the codex/app-server-adapter-2 branch from c62e8b0 to 492b304 Compare June 29, 2026 09:47
@joshsny joshsny marked this pull request as draft June 29, 2026 09:47
@greptile-apps

greptile-apps Bot commented Jun 29, 2026

Copy link
Copy Markdown
Contributor

Reviews (1): Last reviewed commit: "improve coverage of app server" | Re-trigger Greptile

Comment on lines +29 to +35
const ADAPTERS: Adapter[] = ["claude", "codex"];

const EDIT_PROMPT =
"Do exactly these steps and nothing else: 1) Read the file target.txt. " +
"2) Edit it so the second line reads FOO instead of line2. " +
"3) Run the shell command `cat target.txt`. " +
"4) In one sentence confirm what you changed, then stop.";

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P2 for-loop over adapters instead of parametrised test

The for (const adapter of ADAPTERS) loop is a hand-rolled substitute for describe.each. The team's explicit rule is to prefer parametrised tests. The same pattern appears in structured-output.e2e.test.ts. Both files generate one describe block per adapter through a loop; describe.each expresses that intent directly and is idiomatic in Vitest.

Context Used: Do not attempt to comment on incorrect alphabetica... (source)

Note: If this suggestion doesn't match your team's coding style, reply to this and let me know. I'll remember it for next time!

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Keeping the for loop here deliberately: each adapter needs a block-level describe.skipIf(E2E.skipReason(adapter)) with the skip reason baked into the title, and describe.each can't express a per-case skip condition (skipIf chains before .each, so the condition can't depend on the adapter). Converting would mean per-it skips plus a guarded beforeAll, which weakens the skip semantics for no coverage gain. The repetitive outcome tests in approvals.test.ts are now it.each though.

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That's a valid constraint — describe.skipIf in Vitest resolves the condition before .each binds the parameter, so you can't make the skip predicate a function of the per-row adapter value. The for-loop pattern is the correct workaround here, and the skip semantics (entire describe block skipped with the reason surfaced in the title) are genuinely better than per-it guards.

Fair point, withdrawing the comment.

Shall I remember this as a rule: "prefer a for-loop over describe.each when each adapter needs a block-level describe.skipIf whose condition depends on the adapter value"?

Tip: You can customize Greptile's behavior for this repo with .greptile/rules.md and .greptile/config.json.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Keeping the for-loop here deliberately: each adapter needs a block-level describe.skipIf(E2E.skipReason(adapter)) with the skip reason baked into the title, and describe.each can't express a per-case skip condition (skipIf chains before .each, so the condition can't depend on the adapter). Converting would mean per-it skips plus a guarded beforeAll, which weakens the skip semantics for no coverage gain. The repetitive outcome tests in approvals.test.ts are now it.each though.

Comment on lines +31 to +90
{ outcome: "selected", optionId: "option_1" },
]);

const params = {
threadId: "t",
turnId: "turn",
itemId: "item-9",
autoResolutionMs: null,
questions: [
{
id: "q1",
header: "Pick one",
question: "Which environment?",
isOther: false,
isSecret: false,
options: [
{ label: "staging", description: "" },
{ label: "production", description: "danger" },
],
},
],
};

const result = await handleServerRequest(
APP_SERVER_REQUESTS.TOOL_USER_INPUT,
params,
client,
opts,
);

expect(result.handled).toBe(true);
expect(result.response).toEqual({
answers: { q1: { answers: ["production"] } },
});

// Prompt carried the question's options and the session id.
expect(calls).toHaveLength(1);
expect(calls[0].sessionId).toBe("sess-1");
expect(calls[0].options.map((o) => o.name)).toEqual([
"staging",
"production",
]);
});

it("defaults a cancelled question to an empty answer", async () => {
const { client } = fakeClient([{ outcome: "cancelled" }]);

const params = {
threadId: "t",
turnId: "turn",
itemId: "item-1",
autoResolutionMs: null,
questions: [
{
id: "q1",
header: "h",
question: "q?",
isOther: false,
isSecret: false,
options: [{ label: "a", description: "" }],

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P2 Repetitive outcome tests could use it.each

The allow / reject / cancel outcome tests for each of the three request types (TOOL_USER_INPUT, PERMISSIONS_APPROVAL, MCP_ELICITATION) share the same structure: queue an outcome, call handleServerRequest, assert the response. Grouping them as it.each rows would express the same coverage more concisely, consistent with the team's preference for parametrised tests.

Context Used: Do not attempt to comment on incorrect alphabetica... (source)

Note: If this suggestion doesn't match your team's coding style, reply to this and let me know. I'll remember it for next time!

Comment on lines +63 to +73
if (block.type === "resource" && "text" in block.resource) {
const uri = block.resource.uri ?? "";
if (uri.startsWith("file://")) {
input.push(textInput(resourceLinkText(uri)));
continue;
}
input.push(textInput(uri));
context.push(
`<context ref="${uri}">\n${block.resource.text}\n</context>`,
);
}

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P2 Empty text block emitted when resource has no URI

When block.resource.uri is undefined or null, block.resource.uri ?? "" yields "" and textInput("") is pushed into the input array. Codex may reject a turn/start payload containing an empty-string text block, and the empty string carries no useful information — the resource content is already in the trailing <context ref=""> block. Skip the bare-URI push when uri is empty. This path has no unit test coverage.

Suggested change
if (block.type === "resource" && "text" in block.resource) {
const uri = block.resource.uri ?? "";
if (uri.startsWith("file://")) {
input.push(textInput(resourceLinkText(uri)));
continue;
}
input.push(textInput(uri));
context.push(
`<context ref="${uri}">\n${block.resource.text}\n</context>`,
);
}
if (block.type === "resource" && "text" in block.resource) {
const uri = block.resource.uri ?? "";
if (uri.startsWith("file://")) {
input.push(textInput(resourceLinkText(uri)));
continue;
}
if (uri) {
input.push(textInput(uri));
}
context.push(
`<context ref="${uri}">\n${block.resource.text}\n</context>`,
);
}

@joshsny joshsny requested a review from a team June 30, 2026 17:35
@joshsny joshsny force-pushed the codex/app-server-adapter-2 branch 2 times, most recently from 14877aa to 7d28bad Compare July 1, 2026 18:00
@joshsny joshsny marked this pull request as ready for review July 1, 2026 18:00
@greptile-apps

greptile-apps Bot commented Jul 1, 2026

Copy link
Copy Markdown
Contributor

Reviews (2): Last reviewed commit: "remove some comments" | Re-trigger Greptile

Comment on lines +997 to +1014
if (response.outcome.optionId === "reject_with_feedback") {
// codex's response has no feedback field, so decline and inject the guidance
// into the running turn (as its TUI does: Denied + a follow-up message).
const feedback = (response as { _meta?: { customInput?: unknown } })
._meta?.customInput;
const activeTurnId = this.turns.activeTurnId;
if (typeof feedback === "string" && feedback.trim() && activeTurnId) {
void this.rpc
.request(APP_SERVER_METHODS.TURN_STEER, {
threadId: this.threadId,
input: toCodexInput([{ type: "text", text: feedback.trim() }]),
expectedTurnId: activeTurnId,
})
.catch((err) =>
this.logger.warn("turn/steer (reject feedback) failed", err),
);
}
return { decision: "decline" };

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P1 Stale activeTurnId after feedback steer

The reject_with_feedback path fires turn/steer as a fire-and-forget void call and never calls this.turns.onSteered(newTurnId) with the rotated turn ID that codex returns. After this steer completes, activeTurnId still points to the old (now invalid) turn ID. Any subsequent operation that reads activeTurnId — a second rejection with feedback, an interrupt, or a concurrent prompt() call that hits the turns.isRunning steer branch — will supply a stale expectedTurnId and codex will reject the request with a turn-ID mismatch error.

The main prompt() steer path shows the correct pattern: await the steer response, then call this.turns.onSteered(steerRes?.turnId) to keep the turn controller current.

@joshsny joshsny force-pushed the feat/codex-app-server-adapter branch from 8899ff0 to 589557c Compare July 2, 2026 12:19
@joshsny joshsny force-pushed the codex/app-server-adapter-2 branch from 7d28bad to 78af4dc Compare July 2, 2026 12:19
joshsny and others added 7 commits July 3, 2026 18:16
- adopt the rotated turn id after a feedback steer so later interrupts/steers target the live turn
- skip the empty bare-uri text block for uri-less resources
- parametrize the approval outcome tests with it.each
- add @openai/codex to the lockfile (frozen-lockfile install was failing every job)
- drop a committed vite timestamp temp file and fix biome format errors

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
@joshsny joshsny force-pushed the codex/app-server-adapter-2 branch from 5754c6f to 47cd18c Compare July 3, 2026 17:16
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant